Hongsheng Li

OneVLA: A Unified Framework for Embodied Tasks

May 31, 2026
Via arXiv

UI-KOBE: Knowledge-Oriented Behavior Exploration for Lightweight Graph-Guided GUI Agents

May 28, 2026
Via arXiv

OmniInteract: Benchmarking Real-World Streaming Interaction for Real-Time Omnimodal Assistants

May 26, 2026
Via arXiv

Rethinking VLM Representation for VLA Initialization

May 25, 2026
Via arXiv

Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning

May 20, 2026
Via arXiv

MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving

May 14, 2026
Via arXiv

Edit-Based Refinement for Parallel Masked Diffusion Language Models

May 10, 2026
Via arXiv

Context Unrolling in Omni Models

Apr 23, 2026
Via arXiv

Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning

Apr 14, 2026
Via arXiv

LMGenDrive: Bridging Multimodal Understanding and Generative World Modeling for End-to-End Driving

Apr 09, 2026
Via arXiv